-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3040] pick up a more proper local ip address for Utils.findLocalIpAddress method #1946
Conversation
get a more proper local ip address.
Can one of the admins verify this patch? |
I have seen cases where a bad interface was chose, so this does seem like a good idea. But for windows, does this mean that the wrong interface is chosen? Since we support windows, should we add a check here to confirm we are on a unix-like system? |
Sorry, I didn't realize windows is supported. In that case, I believe a check is necessary. I will update the pr. |
@pwendell, would you look at this? It's a fairly simple fix. I don't have windows for primary use, so it's not confirmed on windows. I hope someone who uses windows can confirm this behavior. |
Is this relying on documented behaviour or observed evidence ?
|
I do agree with @mridulm it's a somewhat risky change so I think it's out of scope to put it in 1.1. I think that it's a native method with no stated guarantees about ordering. |
@pwendell @mridulm no, this behavior is not documented. The java SE doc doesn't say anything about ordering.. So before filing the JIRA, I looked up the OpenJDK source code, both 6 and 7.
jdk/src/solaris/native/java/net/NetworkInterface.c function *addif Line1000
Of course, the whole flow is more complicated, but you can get the idea. |
And @pwendell, I don't think this is risky. It's a minor change and at least it don't do worse. |
By risky I mean this - many existing users of Spark might depend on the current behavior, so if we suddenly chose a different interface it will break existing behavior and it could be non-obvious to users why this is happening. |
Thanks for digging into this ! From what I see, this is an implementation detail of a specific jdk version Which is not to say we can't include it in some future release of spark ,
|
@mridulm, could you give me the link to report a jdk bug? It's very confusing that jdk has many different jira place (OpenJDK jira, Oracle etc.). After second thought, maybe it's a bad idea to rely on the output order. Maybe we should iterate over all interfaces and get the best(or reasonable) ip. |
Not sure, https://bugs.openjdk.java.net/browse/JDK ? On Sun, Aug 17, 2014 at 8:54 AM, 叶先进 [email protected] wrote:
|
Can one of the admins verify this patch? |
On this one I'm a little torn - mostly because I've directly experienced this issue and also I've seen users run into it, so I think it's a fairly severe issue in terms of how we chose the default. The current behavior is indeed backwards in the most common case (someone running oracle or openJDK on Linux/Unix). For that reason, I'm inclined to merge this and see if anyone runs into any case where this somehow regresses behavior, and if so we can consider reverting it for 1.2. When we are considering things like this it's generally good to merge them early in a release cycle so it gets a lot of testing. |
I'd love to see this get merged into 1.2. |
I'm not for or against the fix, but just wanted to point out that event the order the OS returns the interfaces is not really reliable. On my machine, where I constantly switch between wired and wireless, and often have both on, the order of eth0 vs. wlan0 seems to depend on the order they got connected more than anything. So sure, the change sounds fine if it helps with a common case (or at least to make windows and unix behave more alike?), but really I don't see how anyone can rely on any behavior here since there is really no expected behavior of the underlying api. |
In that case I'd propose merging this tentatively and if it causes issues in the 1.2 dev/QA cycle we can revert it. I dug around a bunch, it looks like there really isn't clear behavior here, even within Linux variants because the code used to do this had to change a lot for IPv6 support.There isn't an ordering specification anywhere I can find in the internal API's in BSD/POSIX sockets that relate to this. So it could be that this patch helps some and hurts others. As such, I'd be inclined to try it out and rollback if we find anyone is hurt by this (if it's not a strict win then we should bias towards the old behavior). |
Alright - let's see how this goes and we can rollback if we see any issues. |
… they are not equal (apache#1946) -- Allow SPJ between 'compatible' bucket funtions -- Add a mechanism to define 'reducible' functions, one function whose output can be 'reduced' to another for all inputs. ### Why are the changes needed? -- SPJ currently applies only if the partition transform expressions on both sides are identifical. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added new tests in KeyGroupedPartitioningSuite ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#45267 from szehon-ho/spj-uneven-buckets. Authored-by: Szehon Ho <[email protected]> Signed-off-by: Chao Sun <[email protected]> Co-authored-by: Szehon Ho <[email protected]>
Short version: NetworkInterface.getNetworkInterfaces returns ifs in reverse order compared to ifconfig output. It may pick up ip address associated with tun0 or virtual network interface.
See SPARK_3040 for more detail